[HWConfig] narrow_range parameter is introduced in hardware config #3196
base: develop
@@ -288,7 +293,7 @@
 {
 "type": "Embedding",
 "quantization": {
-"weights": ["q8_w_sym", "q8_w_asym"]
+"weights": ["q8_w_sym", "q8_w_asym", "q8_a", "q8_a_sym", "q8_a_ch"]
I'm not sure that LPT supports `q8_a_ch` for embedding. Please double-check.
As far as I understand, this is applicable only to the embedding -> depthwise convolution sub-graph. However, I have not encountered such a sub-graph in the real world.
-"weights": ["q8_w_sym", "q8_w_asym", "q8_a", "q8_a_sym", "q8_a_ch"]
+"weights": ["q8_w_sym", "q8_w_asym", "q8_a", "q8_a_ch"]
I'm not sure that even a per-tensor quantization scheme is applicable here for the `Embedding` layer, since it contains sensitive weights. We need to verify that this scheme does not introduce performance or accuracy regressions on the known cases.
Versa model validation after the rebase:
versa INT8 | roc_auc_score | Embedding weights config | narrow range | Throughput, FPS | Gather execType | Convs execType |
---|---|---|---|---|---|---|
develop | 91.54 | 'B:8 M:S SGN:S PC:N' | TRUE | 642.18 | jit_avx512_i8 | brgconv_avx512_i8 |
daniil-lyakhov:dl/narrow_range_to_qconfig | 91.54 | 'B:8 M:S SGN:S PC:N NR:N' | FALSE | 641.19 | jit_avx512_i8 | brgconv_avx512_i8 |
/*
* Narrow range: should NNCF use 2**num_bits quants or 2**num_bits - 1
*/
"narrow_range": false
},
"q8_sym_tnr_-128_127": { // Alias name for set of hyperparameters
"bits": 8, // Number of quantization bits
"mode": "symmetric", // Quantization mode
"granularity": "pertensor", // Granularity: one scale for output tensor
"level_low": -128, // Low quantization level
Does NNCF support `level_low` and `level_high`?
Isn't it redundant, since other params define how to calculate them?
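To make the redundancy question concrete: for a symmetric quantizer, the integer level bounds can be derived from the bit width and the narrow-range flag alone. The following is a minimal sketch under a common convention (narrow range drops the lowest level of a signed grid). `level_range` is a hypothetical helper written for this illustration, not NNCF's API, and the unsigned branch is an assumption.

```python
from typing import Tuple


def level_range(bits: int, narrow_range: bool, signed: bool = True) -> Tuple[int, int]:
    """Return (level_low, level_high) for an integer quantization grid.

    Assumption: narrow range removes exactly one level, the lowest one for a
    signed grid and the highest one for an unsigned grid. Other conventions exist.
    """
    levels = 2 ** bits
    if signed:
        level_low, level_high = -(levels // 2), levels // 2 - 1
        if narrow_range:
            level_low += 1  # 2**bits - 1 levels, symmetric around zero
    else:
        level_low, level_high = 0, levels - 1
        if narrow_range:
            level_high -= 1
    return level_low, level_high


print(level_range(8, narrow_range=False))  # (-128, 127)
print(level_range(8, narrow_range=True))   # (-127, 127)
```

Under this convention, `"bits": 8` together with `"narrow_range": false` already implies the -128..127 range named in the `q8_sym_tnr_-128_127` alias.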
-"input_low": -0.9350724220275879,
+"input_low": -0.9424352049827576,
 "input_high": 0.9350724220275879,
-"output_low": -0.9350724220275879,
+"output_low": -0.9424352049827576,
 "output_high": 0.9350724220275879
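The changed reference values look consistent with moving from a narrow to a full signed 8-bit range: the new lower bound is the old one scaled by 128/127, up to float32 rounding. This is an observation drawn from the numbers in the diff, not something the PR states. A quick check, assuming a symmetric signed 8-bit grid:

```python
import math

input_high = 0.9350724220275879  # unchanged in the diff
old_low = -0.9350724220275879    # lower bound before the change
new_low = -0.9424352049827576    # lower bound after the change

# Assumption: a full signed 8-bit range has 128 negative steps against
# 127 positive ones, while a narrow range has 127 on both sides.
expected_full_range_low = -input_high * 128 / 127

# Old bounds are exactly symmetric (narrow-range style).
print(math.isclose(old_low, -input_high))
# New lower bound matches the 128/127 scaling within float32 precision.
print(math.isclose(new_low, expected_full_range_low, rel_tol=1e-7))
```

The tolerance of 1e-7 reflects that the reference values appear to be float32 numbers printed at double precision.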
On top of #3232
Changes
Reason for changes
Related tickets
Tests
post_training_quantization/586/ - Passed
job/weekly/job/ubuntu20_eval/245/ (+ job/ubuntu20_eval/246/) Passed
eval_tf/461/ - Passed
torch_weekly/100/ - Passed
job/nightly/job/torch_nightly/444/ - Passed